GATE PYQ

Computer Organization

Q221.
A processor takes 12 cycles to complete an instruction I. The corresponding pipelined processor uses 6 stages with the execution times of 3, 2, 5, 4, 6 and 2 cycles respectively. What is the asymptotic speedup assuming that a very large number of instructions are to be executed?
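
One way to reason about Q221, offered as a sketch and not as an official solution: in steady state a pipeline completes one instruction per slowest-stage time, so the asymptotic speedup is the non-pipelined instruction time divided by the longest stage time. The function name `asymptotic_speedup` is hypothetical, and the model assumes no stalls and no pipeline-register overhead.

```python
def asymptotic_speedup(non_pipelined_cycles, stage_cycles):
    """Speedup for a very large number of instructions.

    Assumption: the pipeline issues one instruction every max(stage_cycles)
    cycles, with no stalls and no latch overhead.
    """
    bottleneck = max(stage_cycles)  # slowest stage sets the issue rate
    return non_pipelined_cycles / bottleneck


# Values taken from the question statement.
print(asymptotic_speedup(12, [3, 2, 5, 4, 6, 2]))  # 2.0
```

Note that the stage times sum to 22 cycles, but the question fixes the non-pipelined time at 12 cycles, so 12 is the numerator used here.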

Q222.
Data forwarding techniques can be used to speed up the operation in the presence of data dependencies. Consider the following replacements of LHS with RHS.

i. $R1 \rightarrow Loc,\ Loc \rightarrow R2 \;\equiv\; R1 \rightarrow R2,\ R1 \rightarrow Loc$
ii. $R1 \rightarrow Loc,\ Loc \rightarrow R2 \;\equiv\; R1 \rightarrow R2$
iii. $R1 \rightarrow Loc,\ R2 \rightarrow Loc \;\equiv\; R1 \rightarrow Loc$
iv. $R1 \rightarrow Loc,\ R2 \rightarrow Loc \;\equiv\; R2 \rightarrow Loc$

In which of the following options will the result of executing the RHS be the same as executing the LHS, irrespective of the instructions that follow?

Q223.
Consider the following reservation table for a pipeline having three stages S1, S2 and S3. The minimum average latency (MAL) is ________.

Q224.
Consider a pipelined processor with the following four stages:

IF: Instruction Fetch
ID: Instruction Decode and Operand Fetch
EX: Execute
WB: Write Back

The IF, ID and WB stages take one clock cycle each to complete the operation. The number of clock cycles for the EX stage depends on the instruction. The ADD and SUB instructions need 1 clock cycle and the MUL instruction needs 3 clock cycles in the EX stage. Operand forwarding is used in the pipelined processor. What is the number of clock cycles taken to complete the following sequence of instructions?

$$
\begin{array}{lllll}
\textbf{ADD} & \text{R2, R1, R0} &&& \text{R2 $\leftarrow$ R1+R0} \\
\textbf{MUL} & \text{R4, R3, R2} &&& \text{R4 $\leftarrow$ R3*R2} \\
\textbf{SUB} & \text{R6, R5, R4} &&& \text{R6 $\leftarrow$ R5-R4} \\
\end{array}
$$
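
The timing in Q224 can be traced with a small simulation. This is a minimal sketch under stated assumptions, not an official solution: instructions issue in order, each stage holds one instruction at a time, and operand forwarding lets a dependent instruction start EX in the cycle right after the producer finishes EX. The function name `completion_cycle` is hypothetical, and the model does not track an instruction blocking an earlier stage while it waits.

```python
def completion_cycle(ex_cycles):
    """Return the cycle in which the last instruction finishes WB.

    ex_cycles: EX-stage latency of each instruction, in program order.
    Stages are IF, ID, EX, WB; IF, ID and WB take 1 cycle each.
    Assumption: with forwarding, the only wait is for the previous
    instruction to vacate the stage (in-order, one instruction per stage).
    """
    prev_finish = [0, 0, 0, 0]  # cycle at which the previous instruction left each stage
    for ex in ex_cycles:
        durations = [1, 1, ex, 1]
        finish = []
        ready = 0  # cycle at which this instruction finished its previous stage
        for stage, d in enumerate(durations):
            start = max(ready, prev_finish[stage])  # wait for data and for the stage
            ready = start + d
            finish.append(ready)
        prev_finish = finish
    return prev_finish[-1]


# ADD (EX = 1), MUL (EX = 3), SUB (EX = 1), as in the question.
print(completion_cycle([1, 3, 1]))  # 8
```

Under this model ADD finishes WB in cycle 4, MUL occupies EX in cycles 4 to 6 and finishes WB in cycle 7, and SUB waits for EX until cycle 7 and finishes WB in cycle 8.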

Q225.
We have two designs D1 and D2 for a synchronous pipeline processor. D1 has 5 pipeline stages with execution times of 3 nsec, 2 nsec, 4 nsec, 2 nsec and 3 nsec, while design D2 has 8 pipeline stages, each with an execution time of 2 nsec. How much time can be saved using design D2 over design D1 for executing 100 instructions?
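
The comparison in Q225 follows from the standard timing formula for a synchronous pipeline, (k + n - 1) × cycle time, where k is the number of stages, n the number of instructions, and the cycle time equals the slowest stage. The sketch below assumes no stalls and no latch overhead; `pipeline_time` is a hypothetical helper name.

```python
def pipeline_time(stage_times_ns, n_instructions):
    """Total time (ns) for n instructions on a synchronous pipeline.

    Assumption: the clock period equals the slowest stage, and there are
    no stalls or pipeline-register delays.
    """
    k = len(stage_times_ns)
    cycle = max(stage_times_ns)
    return (k + n_instructions - 1) * cycle


d1 = pipeline_time([3, 2, 4, 2, 3], 100)  # (5 + 99) * 4 = 416 ns
d2 = pipeline_time([2] * 8, 100)          # (8 + 99) * 2 = 214 ns
print(d1 - d2)                            # 202
```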

Q226.
Delayed branching can help in the handling of control hazards. The following code is to run on a pipelined processor with one branch delay slot:

I1: ADD R2 $\leftarrow$ R7 + R8
I2: SUB R4 $\leftarrow$ R5 - R6
I3: ADD R1 $\leftarrow$ R2 + R3
I4: STORE Memory[R4] $\leftarrow$ R1
BRANCH to Label if R1 == 0

Which of the instructions I1, I2, I3 or I4 can legitimately occupy the delay slot without any other program modification?

Q227.
Delayed branching can help in the handling of control hazards. For all delayed conditional branch instructions, irrespective of whether the condition evaluates to true or false

Q228.
Consider an instruction pipeline with four stages (S1, S2, S3 and S4), each with combinational circuit only. Pipeline registers are required between successive stages and at the end of the last stage. Delays for the stages and for the pipeline registers are as given in the figure. What is the approximate speedup of the pipeline in steady state under ideal conditions when compared to the corresponding non-pipelined implementation?

Q229.
A pipeline P operating at 400 MHz has a speedup factor of 6 and operates at 70% efficiency. How many stages are there in the pipeline?

Q230.
A non-pipelined single-cycle processor operating at 100 MHz is converted into a synchronous pipelined processor with five stages requiring 2.5 nsec, 1.5 nsec, 2 nsec, 1.5 nsec and 2.5 nsec, respectively. The delay of the latches is 0.5 nsec. The speedup of the pipelined processor for a large number of instructions is:
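
The arithmetic for Q230 can be sketched as follows, as one reading of the question and not as an official solution. The non-pipelined processor completes one instruction per clock period, while the pipelined clock period is the slowest stage plus the latch delay. The name `speedup_with_latches` is hypothetical, and the model assumes no stalls.

```python
def speedup_with_latches(clock_mhz, stage_ns, latch_ns):
    """Steady-state speedup of a pipeline over a single-cycle processor.

    Assumptions: the single-cycle design finishes one instruction per clock
    period; the pipeline issues one instruction every
    (slowest stage + latch delay) with no stalls.
    """
    non_pipelined_ns = 1000.0 / clock_mhz      # 100 MHz -> 10 ns per instruction
    pipelined_cycle_ns = max(stage_ns) + latch_ns
    return non_pipelined_ns / pipelined_cycle_ns


s = speedup_with_latches(100, [2.5, 1.5, 2.0, 1.5, 2.5], 0.5)
print(round(s, 2))  # 3.33
```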